Abstract. We describe an objective and automated
method for detecting clusters of galaxies from optical
imaging data. This method is a variant of the so-called
'matched-filter' technique pioneered by Postman et al.
(1996). By using the positions and apparent magnitudes
of galaxies simultaneously, the method not only finds
cluster candidates but also estimates their redshifts and
richnesses as byproducts of detection. We examine the
errors in the estimated position, redshift, and richness
of a cluster with a number of Monte Carlo simulations.
No systematic discrepancies between the true and
estimated values are seen for either redshift or richness.
For clusters at z = 0.2 with richness similar to that of
the Coma cluster, typical errors in the estimates of
position, redshift, and richness are Δθ ~ 10'' (one third
of the projected core radius), Δz ~ 0.02, and ΔN/N ~ 12%,
respectively. The spurious detection rate of the method
is less than about 10% of that of conventional methods,
which use only the surface density of galaxies. A cluster
survey in the North Galactic Pole region is carried out to
verify the performance of the method with real data.
Despite the poor quality of the data, two known real
clusters are successfully detected. No unknown cluster
with low or medium redshift (z < 0.3) is detected. We
expect methods based on the 'matched-filter' technique
to be essential tools for compiling large and homogeneous
optically selected cluster catalogs.
1. Introduction

Studies of clusters of galaxies provide valuable
information on cosmology and extragalactic astronomy.
Most of these studies draw their samples from available
catalogs. For example, Bahcall (1988) compiled previous
work on the two-point angular cluster-cluster correlation
function based on published cluster catalogs. Rhoads,
Gott and Postman (1994) measured the genus curve of Abell
clusters for topological studies of the large-scale
structure of the Universe. Struble and Ftaclas (1994)
studied correlations among richness, flattening, and
velocity dispersion of 350 Abell clusters. A large number
of reports have also been made on the relations between
various properties of clusters (Henry and Tucker 1979;
Edge and Stewart 1991; Lubin and Bahcall 1993; Annis 1996;
and references therein). Multicolor photometry reveals
the color evolution of individual galaxies in clusters:
Butcher and Oemler (1978, 1984) reported that the
fraction of 'blue' galaxies in clusters increases with
redshift. This is known as the 'Butcher-Oemler effect'
and is thought to be a sign of galaxy evolution (see also
Rakos and Schombert 1995). In any case, large and
statistically complete catalogs of clusters are
indispensable for statistical investigations.

Clusters of galaxies can be identified not only as
concentrations of galaxies but also as balls of hot
plasma. Accordingly, both optically selected and
X-ray-selected cluster catalogs have been constructed so
far. A number of X-ray clusters were detected by the
Extended Medium Sensitivity Survey with the Einstein
Observatory (Gioia et al. 1990) and the ROSAT All-Sky
Survey (Voges et al. 1996). X-ray surveys enable one to
produce almost complete catalogs of nearby (z ≲ 0.2)
clusters, since clusters are easily detected as extended
X-ray sources. At present, however, it is quite difficult
to carry out a deep X-ray survey over a wide area of the
sky to assemble a sufficiently large and complete sample
of distant X-ray clusters, whereas searches with optical
data can reach even more distant clusters.

Let us outline the development of optical
cluster-finding techniques and of the optically selected
cluster catalogs themselves in approximately historical
sequence. Catalogs of nearby (z ≲ 0.2) clusters include
those compiled by Abell (1958), Zwicky et al. (1961-68),
Shectman (1985, hereafter S85), Abell, Corwin and Olowin
(1989, hereafter ACO), Lumsden et al. (1992, hereafter
L92), and Dalton et al. (1994, hereafter D94). For more
distant (0.2 < z < 0.9) clusters, there exist four
catalogs: Gunn, Hoessel and Oke (1986), Couch et al.
(1991), Postman et al. (1996, hereafter P96), and Lidman
and Peterson (1996, hereafter LP96). All the above
catalogs except for S85, L92, D94, P96, and LP96 were
constructed by eye selection of clusters on photographic
plates, which clearly required a large effort. However,
these catalogs are claimed to suffer from inhomogeneity
and contamination: a significant fraction of clusters may
be missed (for the Abell/ACO catalog, see Gunn, Hoessel
and Oke 1986; Sutherland 1988; Ebeling et al. 1993), while
some of the cataloged clusters may be spurious (Lucey
1983). These effects become much more critical for fainter
(namely, more distant and/or poorer) clusters.

S85, L92, and D94 detected clusters semi-objectively
(L92 and D94 also did so automatically): S85 and L92
employed the count-in-cells technique, while D94 adopted
the percolation technique. Yet both techniques use only
the projected positions of galaxies and simply pick up
overdensities in the two-dimensional distribution of
galaxies. Consequently, they cannot quantify the detection
rate of spurious clusters due to chance coincidence of
galaxies on the sky. Collins et al. (1995) and Ebeling and
Maddox (1995) reported significant amounts of
contamination in the catalogs compiled by L92 and D94,
respectively. Furthermore, these techniques pick up
overdensities of galaxies within an area of fixed
apparent angular size, even though the actual angular
extent of clusters changes with distance. This means that
the cluster-finding criteria in these methods change with
redshift. Thus these catalogs may not be regarded as far
more objective than 'classical' ones such as the
Abell/ACO catalog.

Escalera and MacGillivray (1995, 1996) have searched
for structures of various scales, from groups up to
superclusters, using the wavelet transform. The wavelet
transform is not tied to a particular apparent size of
structures and enables a 'multi-scale' analysis. However,
because it uses only galaxy positions on the sky, the
wavelet transform also cannot quantify the spurious
detection rate. Dividing the total sample into subsamples
with small ranges of magnitude, as Escalera and
MacGillivray (1996) did, may somewhat suppress spurious
detections in such methods as count-in-cells, percolation,
and wavelet transform, but at the same time it may also
reduce the real signal.

P96 developed and employed an innovative
cluster-finding method based on the 'matched-filter'
technique. The point of their method is to use both the
projected positions and the apparent magnitudes of
galaxies simultaneously. This enables one to obtain rough
estimates of redshifts and richnesses for detected
clusters without any spectroscopic information. Only
images in a single broad band are needed, while obtaining
photometric redshifts requires more than three bands. The
catalog by P96 contains 79 distant clusters
(0.2 < z < 1.2) from V and I band data over 5 deg^2
obtained with the 4-Shooter CCD camera (Gunn et al. 1987)
attached to the Palomar 5 m Hale telescope. Note that all
the above cluster catalogs except for the one by P96 were
based on photographic plates. Although CCDs appeared as
new optical detectors taking the place of photographic
plates in the 1980s, it was extremely time consuming to
use them for survey observations because of their small
sizes. However, recent developments of large-format CCDs
and of CCD mosaic cameras have made it possible to quickly
survey a wide area (~ some deg^2) of the sky and to obtain
a large amount of good-quality data. The cluster catalog
compiled by P96 is also the first CCD-based cluster
catalog. Using similar methods, LP96 conducted a search
for distant clusters and built a catalog of 105
candidates from I band CCD data covering 13 deg^2,
obtained with the Anglo-Australian Telescope.

Prompted by the work of P96, we developed a variant 
method for automatic and objective cluster-finding with 
optical imaging data. Our method has some nontrivial dif- 
ferences from the one by P96 in the details of detection 
process, such as binning the input data and employing 
Poisson statistics. These differences have made apparent 
improvements in processing time and in accuracies in esti- 
mating redshift. In particular, the systematic discrepancy 
between true and estimated values found in P96 has been 
largely reduced. 

In Sect. 2, we discuss the principle of the
cluster-finding method. Detailed performance tests of the
method are described in Sect. 3. In Sect. 4, as a
performance verification test with real data, we perform
a cluster survey using the B band galaxy samples within
the 4.9 deg^2 region around the North Galactic Pole (NGP),
obtained with our Mosaic CCD Camera attached to the 1.05 m
Schmidt Telescope at Kiso Observatory, Japan.

Throughout this paper, we assume H_0 = 80 km s^-1
Mpc^-1 and q_0 = 0.5.
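
The adopted cosmology fixes how physical sizes and luminosities map to angles and apparent magnitudes. As a minimal sketch, assuming a zero cosmological constant (not stated explicitly in the text) so that the closed-form Mattig relation for q_0 = 0.5 applies, the distances could be computed as follows. The function names are hypothetical and chosen only for this illustration.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h0=80.0):
    """Angular diameter distance in Mpc for q0 = 0.5.

    Assumes a zero cosmological constant (Einstein-de Sitter case),
    which is an assumption of this sketch, not a statement of the paper.
    """
    hubble_distance = C_KM_S / h0  # c / H0 in Mpc
    return 2.0 * hubble_distance * (1.0 - 1.0 / math.sqrt(1.0 + z)) / (1.0 + z)


def luminosity_distance_mpc(z, h0=80.0):
    """Luminosity distance in Mpc, d_L = (1 + z)^2 d_A."""
    return angular_diameter_distance_mpc(z, h0) * (1.0 + z) ** 2


def angular_size_arcsec(length_kpc, z, h0=80.0):
    """Angle in arcsec subtended by a proper length given in kpc."""
    theta_rad = (length_kpc / 1000.0) / angular_diameter_distance_mpc(z, h0)
    return math.degrees(theta_rad) * 3600.0
```

For example, `angular_diameter_distance_mpc(0.2)` gives roughly 544 Mpc under these assumptions, and a fixed physical length subtends a smaller angle at z = 0.3 than at z = 0.1.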

2. The method 

We detect clusters with maximum likelihood method in a 
way similar to that by P96, using models of surface density 
and apparent magnitude distribution of cluster galaxies 
and field galaxies. The cluster model has two free param- 
eters, namely, its redshift and richness. 

2.1. Models 
2.1.1. Cluster

We assume spherical symmetry to simplify the cluster
model. The radial distribution of the galaxies in the
model cluster follows the King model with
c = log(r_tidal/r_core) = 2.25 (King 1966; Ichikawa 1986)
and r_core = 170 h^-1 kpc (Girardi et al. 1995), where
r_core and r_tidal are the core radius and the tidal
radius, respectively. The luminosity function of the
galaxies obeys the Schechter function (Schechter 1976)
with α = -1.25 and M*_B = -20.4 (converted from
M*_bJ = -20.12 by Colless 1989, rescaling H_0 and using
M_bJ = M_B - 0.18 by Yoshii and Takahara 1988).
Morphological type mixture, luminosity segregation,
substructure, and the influence of cD galaxies are not
considered. To compute the apparent features of the model
cluster, two more parameters must be assigned, namely the
redshift z_fil (hereafter called the 'filter redshift',
following P96) and the richness N. Obviously, if z_fil is
larger, the angular extent of the model cluster becomes
smaller and its member galaxies become fainter. Dimming by
the K-correction is assumed to be K_B(z_fil) = 4.7 z_fil
(for z_fil ≲ 0.6), which is the value for elliptical
galaxies (Fukugita, Shimasaku and Ichikawa 1995). We
define N to be the number of all galaxies brighter than
(M* + 5). The parameter N roughly represents the
population of bright, giant galaxies in a cluster,
ignoring dwarf galaxies, whose properties such as spatial
distribution and luminosity function are still unclear.
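
To make the luminosity part of the cluster model concrete, the following is a minimal sketch of a Schechter function in absolute magnitude with the quoted α and M*_B, normalized over galaxies brighter than M* + 5 so that it integrates to unity for the N galaxies counted in the richness. The bright-end cutoff of -26 mag, the midpoint integration, and all function names are assumptions of this illustration. The King surface density profile is not reproduced here.

```python
import math

ALPHA = -1.25      # faint-end slope quoted in the text
M_STAR_B = -20.4   # characteristic B magnitude quoted in the text


def schechter_per_mag(abs_mag, m_star=M_STAR_B, alpha=ALPHA):
    """Unnormalized Schechter function per unit absolute magnitude."""
    x = 10.0 ** (0.4 * (m_star - abs_mag))  # luminosity in units of L*
    return x ** (alpha + 1.0) * math.exp(-x)


def _integrate(a, b, m_star, alpha, steps=2000):
    """Midpoint-rule integral of the Schechter function from a to b."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    return h * sum(schechter_per_mag(a + (k + 0.5) * h, m_star, alpha)
                   for k in range(steps))


def normalized_fraction(mag_lo, mag_hi, m_star=M_STAR_B, alpha=ALPHA,
                        bright_cut=-26.0):
    """Fraction of the N galaxies (all brighter than M* + 5) that lie
    in the absolute magnitude range [mag_lo, mag_hi].

    bright_cut is an arbitrary bright limit assumed for this sketch;
    the Schechter function is negligible there.
    """
    faint_cut = m_star + 5.0
    lo = max(mag_lo, bright_cut)
    hi = min(mag_hi, faint_cut)
    total = _integrate(bright_cut, faint_cut, m_star, alpha)
    return _integrate(lo, hi, m_star, alpha) / total


def k_correction_b(z_fil):
    """B band K-correction K_B = 4.7 z_fil adopted in the text (z_fil up to about 0.6)."""
    return 4.7 * z_fil
```

With this normalization, most of the N galaxies sit in the faintest magnitudes of the range, which is why the limiting magnitude strongly affects how much of a distant cluster is observable.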



2.1.2. Field

We assume that field galaxies are randomly distributed on
the sky, namely, the angular two-point correlation
function is not considered. For the simulations in Sect. 3,
we adopt the deep galaxy number count data by Metcalfe et
al. (1995) as the model of the apparent magnitude
distribution of field (foreground and background)
galaxies. For the actual galaxy data discussed in Sect. 4,
we use the magnitude distribution of all galaxies in the
survey area to search for clusters, as if it were that of
pure field galaxies. This causes an overestimate of the
number of field galaxies, especially when the survey area
is small and a cluster happens to cover a large portion
of it. Hence we have to deal with a large enough area that
clusters or even a large-scale structure will not
seriously affect the estimate of the number of field
galaxies in the area. Yet sometimes iteration would be
needed. In that case, we mark conspicuous 'cluster'
regions by referring to the first-pass result and then
execute a second calculation using the 'more accurate'
field galaxy sample in the area outside the 'cluster'
regions.

2.2. Algorithm

What we need at the beginning is just an ordinary catalog
of galaxies containing projected positions and apparent
magnitudes. Fig. 1 shows an example galaxy distribution.
The galaxies are generated by a Monte Carlo simulation
based on the model described in Sect. 2.1. A cluster with
(z, N) = (0.20, 1000), roughly equal to an Abell richness
class 0-1 cluster, is located at the center.

Fig. 1. Galaxy distribution in an area containing an artificial
cluster with (z, N) = (0.20, 1000) at the center. The symbol
size changes with the apparent magnitude. The largest and
smallest symbols correspond to m_B = 16.0 and 23.5,
respectively.

Next we compute the likelihood L that a cluster is
present at a particular point. We consider n_θ concentric
annular regions centered on the position. Their angular
inner radii and widths are θ_i and Δθ_i (1 ≤ i ≤ n_θ),
respectively. By counting the number of galaxies which
fall in n_m magnitude bins (m_j < m < m_j + Δm_j,
1 ≤ j ≤ n_m) for each annular region, we obtain an array
O_ij (1 ≤ i ≤ n_θ, 1 ≤ j ≤ n_m) consisting of n_θ × n_m
galaxy numbers.

On the other hand, we can calculate an equivalent array
M_ij for the model galaxies. M_ij is described as

M_{ij} = N C_{ij} + F_{ij},    (1)

where C_ij is an array for member galaxies in a
normalized model cluster located at the center of the n_θ
concentric annular regions, and F_ij is an array for field
galaxies. C_ij is written as

C_{ij} = 2\pi \int_{\theta_i}^{\theta_i + \Delta\theta_i} \theta \, \sigma_c(\theta) \, d\theta \int_{m_j}^{m_j + \Delta m_j} \phi_c(m) \, dm,    (2)

where σ_c(θ) is the surface density profile and φ_c(m) is
the differential luminosity function of cluster galaxies.
Both σ_c(θ) and φ_c(m) depend on the filter redshift and
are normalized as

2\pi \int_0^{\infty} \theta \, \sigma_c(\theta) \, d\theta = 1    (3)

and

\int_{-\infty}^{m^* + 5} \phi_c(m) \, dm = 1.    (4)

F_ij is written as

F_{ij} = 2\pi \sigma_f \int_{\theta_i}^{\theta_i + \Delta\theta_i} \theta \, d\theta \int_{m_j}^{m_j + \Delta m_j} \phi_f(m) \, dm,    (5)

where σ_f is the surface density and φ_f(m) is the
differential luminosity distribution of field galaxies.

The logarithmic likelihood is given by

\ln \mathcal{L} = \sum_{i,j} \left\{ O_{ij} \ln(N C_{ij} + F_{ij}) - (N C_{ij} + F_{ij}) - \ln(O_{ij}!) \right\}.    (6)

Here we assume that O_ij obeys Poisson statistics, since
its values amount to about 10 or less for typical clusters
at z ~ 0.2 with our choice of values for Δθ_i and Δm_j (see
the next section). If we assume a Gaussian distribution
for O_ij, Eq. (6) becomes simply equivalent to -χ^2 (P96
did so, expecting that there would be enough background
galaxies; see Eq. 12 of their paper). However, this
assumption leads to overestimating N by about 20% of the
true value for the case of (z, N) = (0.20, 1000). This is
because the Poisson distribution is not symmetric and has
a longer tail toward larger values. The use of Poisson
statistics helps to reduce the possible systematic error
in the estimates of the redshift and the richness.
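
The Poisson logarithmic likelihood of Eq. (6) is straightforward to evaluate once the binned arrays are available. The sketch below assumes that the observed counts, the normalized cluster array, and the field array are given as nested lists of equal shape (annulus index first, magnitude bin second); the function names are hypothetical.

```python
import math


def model_counts(richness, cluster, field):
    """Model array M_ij = N * C_ij + F_ij, as in Eq. (1)."""
    return [[richness * c + f for c, f in zip(c_row, f_row)]
            for c_row, f_row in zip(cluster, field)]


def poisson_log_likelihood(observed, cluster, field, richness):
    """Logarithmic likelihood of Eq. (6) for a trial richness N.

    Every model cell N * C_ij + F_ij must be positive; lgamma(O + 1)
    is used for ln(O!).
    """
    total = 0.0
    for o_row, c_row, f_row in zip(observed, cluster, field):
        for o, c, f in zip(o_row, c_row, f_row):
            m = richness * c + f
            total += o * math.log(m) - m - math.lgamma(o + 1.0)
    return total
```

For a single cell with O = 2, C = 0.001, and F = 1, the likelihood is larger at N = 1000 (where the model count equals the observed count) than at N = 0 or N = 3000.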

Eq. (6) is a function of both the filter redshift z_fil
and the richness N. In order to simplify calculations, we
first fix z_fil to a certain value and maximize L by
optimizing only N. The partial derivative of the
logarithmic likelihood (6) with respect to N is

\frac{\partial \ln \mathcal{L}}{\partial N} = \sum_{i,j} \left\{ \frac{O_{ij} C_{ij}}{N C_{ij} + F_{ij}} - C_{ij} \right\}.    (7)

Eq. (7) is evidently a monotonically decreasing function
of N. If we find a certain richness value N_p for which
Eq. (7) becomes zero, L has a peak value L_p at N = N_p.
Computing L_p and N_p at every point in the whole image,
we obtain a 'likelihood image' L_p(x, y) and a 'richness
image' N_p(x, y) for the fixed filter redshift.
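
Because the derivative with respect to N is monotonically decreasing, its zero can be located by simple bisection. The following sketch illustrates this; the choice of bisection, the upper bound `n_max`, and the clamping of negative solutions to N = 0 are assumptions of this illustration rather than the procedure actually used in the paper.

```python
def peak_richness(observed, cluster, field, n_max=1.0e5, tol=1.0e-6):
    """Richness N_p at which d(ln L)/dN vanishes, found by bisection.

    Requires F_ij > 0 in every cell. Returns 0.0 when the derivative is
    already non-positive at N = 0 (no cluster signal), and n_max when
    the root lies beyond the search range.
    """
    cells = [(o, c, f)
             for rows in zip(observed, cluster, field)
             for o, c, f in zip(*rows)]

    def derivative(n):
        # d(ln L)/dN = sum over cells of O*C / (N*C + F) - C
        return sum(o * c / (n * c + f) - c for o, c, f in cells)

    if derivative(0.0) <= 0.0:
        return 0.0
    if derivative(n_max) > 0.0:
        return n_max
    lo, hi = 0.0, n_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if derivative(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the observed array equals N C_ij + F_ij exactly for some N, the routine recovers that N; repeating it at every grid position would yield the 'richness image'.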

Figs. 2a and 2b show, respectively, the 'likelihood
image' and the 'richness image' for z_fil = 0.20 generated
from the galaxy distribution shown in Fig. 1. We can
recognize the existence of a cluster by a peak in both
images. However, the appearances of the peaks are quite
different. While the peak in the 'richness image' is
simple and very prominent, in the 'likelihood image' there
is a ring-like region of slightly lower likelihood around
a weak central peak, and L_p increases again outside the
ring. This is because it is difficult to discriminate a
cluster with very small N from the 'field'. Though only
the 'likelihood image' is theoretically needed to detect
clusters, peaks in the 'likelihood image' corresponding to
clusters are often very obscure, as seen in Fig. 2a. We
therefore first find a peak in the 'richness image' and
then check whether there is also a peak in the
corresponding 'likelihood image'. If a peak exists at
nearly the same point in both images, we regard it as a
cluster candidate.

Fig. 2. 'Likelihood image' (upper panel (a)) and 'richness
image' (lower panel (b)) of the artificial cluster in Fig. 1 for
z_fil = 0.20.

We obtain several pairs of 'likelihood image' and
'richness image' for other filter redshifts and find peaks
in both images. Then we plot the peak L_p and the peak N_p
as functions of filter redshift for each cluster
candidate. An example is shown in Figs. 3a and 3b,
respectively. If we find a peak in the L_p-z_fil plot
(Fig. 3a), z_fil at the peak is the redshift estimate of
the cluster candidate (hereafter z_est). Once z_est is
obtained, N_p for that z_est (hereafter N_est) can also be
found as shown in Fig. 3b. Some cluster candidates do not
show any remarkable single peak of L_p in the L_p-z_fil
plot. Such candidates may be spurious.
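
The final step, reading z_est and N_est off the peak of the L_p-z_fil curve, can be sketched as below. The acceptance rule used here (the maximum must lie strictly inside the sampled redshift grid) is only a simple stand-in, assumed for illustration, for the judgement of whether a 'remarkable single peak' exists; the function name is hypothetical.

```python
def estimate_redshift(z_grid, peak_loglike, peak_richness_values):
    """Return (z_est, N_est) at the maximum of L_p(z_fil).

    Returns None when the maximum sits on the edge of the grid, in
    which case no interior peak was found and the candidate is treated
    as doubtful in this sketch.
    """
    best = max(range(len(z_grid)), key=lambda k: peak_loglike[k])
    if best == 0 or best == len(z_grid) - 1:
        return None
    return z_grid[best], peak_richness_values[best]
```

A noisy curve with several comparable maxima would still pass this simple test, so a practical implementation would need a stricter peak criterion.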

3. Performance test 

We examine the performance of the method described in
Sect. 2 by Monte Carlo simulations. The errors in the
estimates of position, redshift, and richness, the missing
rate of existing clusters (incompleteness), and the
spurious detection rate are investigated. A comparison of
this method with that of P96 is also discussed. In this
section, we adopt θ_1 = 0, Δθ = 2 r_core / d_A(z = 0.15),
where d_A(z) is the angular diameter distance, n_θ = 5,
m_1 (in the B band) = 14.0, Δm = 0.5, and n_m = 19. The
limiting magnitude is set to m_B = 23.5.
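
With these parameters, building the observed array O_ij around a trial position amounts to a two-dimensional histogram in radius and magnitude. The sketch below assumes contiguous annuli of equal width starting at zero radius, a flat-sky approximation, and galaxy positions given in the same angular units as the annulus width; the function name and the input layout are hypothetical.

```python
import math


def bin_counts(galaxies, center, delta_theta,
               n_theta=5, m_first=14.0, delta_m=0.5, n_m=19):
    """Count galaxies in n_theta annuli and n_m magnitude bins.

    galaxies: iterable of (x, y, magnitude) tuples.
    center:   (x, y) of the trial cluster position.
    Galaxies outside the outermost annulus or outside the magnitude
    range [m_first, m_first + n_m * delta_m) are ignored.
    """
    counts = [[0] * n_m for _ in range(n_theta)]
    cx, cy = center
    for x, y, mag in galaxies:
        i = int(math.hypot(x - cx, y - cy) // delta_theta)
        j = math.floor((mag - m_first) / delta_m)
        if 0 <= i < n_theta and 0 <= j < n_m:
            counts[i][j] += 1
    return counts
```

The default magnitude bins run from 14.0 to 23.5, matching the limiting magnitude quoted above.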



3.1. Estimates of position, redshift, and richness 

3.1.1. Monte Carlo simulation 

When a cluster is detected, its projected position,
redshift, and richness are estimated. Errors in these
estimates depend not only on the real redshift and
richness (hereafter z_real and N_real, respectively), but
also on the limiting magnitude, color band, and Galactic
absorption. To evaluate the dependence on z_real and
N_real, we examine 20 cases with z_real = {0.16, 0.20,
0.24, 0.28} for N = 300 and z_real = {0.16, 0.20, 0.24,
0.28, 0.35, 0.40, 0.45, 0.50} for N = {1000, 3000}. In the
present study we limit the redshift range to z < 0.5, for
which we expect that ample data will be available in the
near future. For each case, 500 artificial B band galaxy
samples are generated by Monte Carlo simulation according
to the model described in Sect. 2. N = 300 corresponds to
MKW-AWM systems (Bahcall 1980), N = 1000 corresponds to
Abell richness class 0-1, and N = 3000 corresponds to
Abell richness class 2 (similar to the Coma cluster). The
relationship between our N and the Abell richness
parameter C = N_{m_3 < m < m_3 + 2} is presented in the
Appendix. Galactic absorption is not considered.

3.1.2. Position 

We measure the angular distance between the true
position of the cluster center x_0 and the estimated
position x_N, where N_p is maximum in the 'richness
image'. Properly speaking, we should use the position x_L
corresponding to the peak L_p rather than the peak N_p. It
is, however, much easier to detect a peak in the 'richness
image' than in the 'likelihood image', as described in
Sect. 2. Since x_N is actually close enough to x_L (the
separation is much less than the core radius), there is
almost no problem in using x_N. The estimated positions
are distributed around x_0 and are well fit by a
two-dimensional Gaussian distribution. Fig. 4 shows the
values of σ_est of the best-fit Gaussians normalized by
the angular core radius. The errors in the estimates are
about θ_core, 0.5 θ_core, and 0.3 θ_core for N = 300,
1000, and 3000, respectively. These values are quite small
compared with the angular extent of the clusters
themselves.



Fig. 3. Upper panel (a) displays peak logarithmic likelihood 
as a function of filter redshift for the artificial cluster in Fig. 
1. Lower panel (b) shows peak richness as a function of filter 
redshift for the same data. 



3.1.3. Redshift and richness 

Fig. 5 shows the results of the redshift and richness
estimates for the 12 nearby cases of the artificial
clusters. The plus marks indicate the most probable values
and the two contours represent the 68% and 95% confidence
levels. The three sets of a plus mark and two contours in
each panel are for N = 300, 1000, and 3000. The contours
are all elongated from the bottom left to the upper right.
This is because the estimates of redshift and richness are
coupled with each other: a rich cluster at a large
distance looks similar to a less rich, nearer cluster.

Fig. 4. Errors in position estimation, σ_est, normalized by
the angular core radius θ_core, as a function of cluster
redshift. Filled circles and solid line are for clusters with
N = 3000, open circles and dashed line for N = 1000, and open
triangles and dotted line for N = 300. How much fainter than
m* we can observe is shown at the top.

Fig. 5. Errors in estimates of redshift and richness. In each
of panels a-d, three cases corresponding to N = 3000, 1000,
and 300 are shown. The plus mark indicates the most probable
value. Inner and outer contours around the plus mark show the
68% and 95% confidence levels, respectively.

Fig. 6. Upper panel (a) shows errors in redshift estimation,
while lower panel (b) shows errors in richness estimation. The
error bars represent ±1σ, corresponding to the inner contours
in Fig. 5. Filled circles and solid line are for clusters with
N = 3000, open circles and dashed line for N = 1000, and open
triangles and dotted line for N = 300.

The direction of the largest dispersion in the
distribution of the 500 points of (z_est, N_est), namely
the direction of the major axis of the contours in Fig. 5,
differs among clusters of different richnesses. This is
due to the different ratio of the number of cluster
galaxies to that of field galaxies within the cluster
region. Figs. 6a and 6b show the accuracies of the
estimates of redshift and richness, respectively, for all
20 cases. Error bars show the widths of the 68% confidence
contours in Fig. 5, projected onto the corresponding axis.
Errors in the estimates of redshift and richness at
z = 0.2 are, respectively, about 0.02 and 12% for
N_real = 3000 clusters and about 0.04 and 30% for
N_real = 1000 clusters. No systematic deviations from the
true values are seen. Thus, redshift and richness
estimation by this method works fairly well without any
spectroscopic information.



These errors are internal. In practice, there exist
external errors in addition to the internal ones
investigated above, owing to intrinsic properties of real
clusters: dispersion in M* values, variations in the
shapes of luminosity functions and surface density
profiles, elongation of clusters, substructures,
overlapping with other clusters along the line of sight,
etc. These uncertainties will affect the estimates of
z_est and N_est. Moreover, for very distant (z ≳ 1)
clusters, systematic evolution of cluster galaxies or of
clusters themselves may also affect the estimates.

The most direct and serious effect on the redshift
estimation comes from the dispersion in M*. Colless (1989)
evaluated the upper limit of the dispersion in M* to be
0.4 mag, which corresponds to a redshift estimation error
of Δz ~ 0.03 in the B band. For the other uncertainties,
it is difficult to evaluate quantitatively their effects
on the redshift and richness estimates, since the
intrinsic properties of real clusters are still unclear.
Therefore, we should rather study them in more detail
after obtaining a 'large and statistically complete'
cluster catalog with an 'objective' cluster-finding method
such as the present one, by changing the parameters of the
cluster models. Spectroscopic observations are also needed
to verify the results of the redshift estimates and to
study the M* values and their dispersion, evolution, etc.
Several iterations would be needed to establish both a
really objective cluster catalog and a really objective
cluster-finding technique.

3.2. Incompleteness 

For a real but very faint (poor and/or distant) cluster,
we may miss either the likelihood peak or the richness
peak or both. To evaluate the probabilities of missing
real clusters, we again use the 20 × 500 artificial
clusters. We find that our cluster-finding technique can
detect almost all clusters up to z_real ~ 0.30. In the
case of N_real = 3000, the missing probabilities do not
exceed 0.2% (namely, no cluster in 500 samples is missed)
at z_real ≤ 0.35. The number of missed clusters then
begins to increase, up to ~5% at z_real = 0.50. In the
case of N_real = 1000, incompleteness appears at
z_real = 0.28 and grows up to ~15% at z_real = 0.50. Even
for poor (N_real = 300) clusters, only 8-15% are missed in
the range 0.16 ≤ z_real ≤ 0.28.

Gunn, Hoessel and Oke (1986) pointed out the large 
incompleteness of the Abell catalog at z ~ 0.30. There
are only 8 Abell clusters in the regions they observed, 
although they estimated that about 150 clusters exist up 
to the redshift limit of 0.30. Complete sampling of distant 
clusters is indispensable for the correct understanding of 
their nature. 

3.3. Spurious detection rate 

We study the detection rate of non-physical (spurious)
clusters using artificial random distributions of galaxies.
Of course, the actual field galaxies have non-zero angular 
correlation function (e.g., Davis and Peebles 1983). There- 
fore the actual spurious detection rate may be slightly dif- 
ferent from those based on random distribution. Even if 
the distribution of field galaxies is random, certainly there 
exist some galaxy clumps by projection effects. Searching 
for clusters by simply finding overdensities of galaxies on 
the sky will result in detecting a number of such spurious 
ones. Here we display how well we can suppress spurious 
detections by taking into account magnitude information 
and projected positions simultaneously. 

We evaluate the spurious detection rate with 1000 sets 
of artificial 50' x 50' 'field' data which do not contain any 
clusters. The limiting magnitude in the B band is 23.5. In 
order to evaluate the spurious detection rate rigorously, it 
is necessary to obtain (z_est, N_est) of all spurious clusters.
However, since this is a time-consuming task, we adopt a 
simpler approach to roughly estimate the upper limit of 
spurious detection rate here. 

In a 'richness image' for a given filter redshift, we
simply count the number of 'richness peaks' which exceed a
given threshold value N_th and are not separated by more
than 5 θ_core from the corresponding 'likelihood peaks'. We
perform this task for the 1000 artificial 'fields'. The
distribution of the number of 'richness peaks' (per
50' x 50' area) over the 1000 fields is very well fit by a
Poisson distribution. We compute the best-fit Poisson mean
value (λ) with the least squares method. Then we convert λ
to the value per deg^2 and simply regard it as an upper
limit of the spurious detection rate. In Fig. 7, we show
by solid lines the upper limits of the spurious detection
rate for four thresholds (N_th = 200, 300, 400, and 500)
as a function of filter redshift.
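
The conversion from a Poisson mean per simulated field to a rate per square degree is simple arithmetic, sketched below. Note that this illustration estimates the Poisson mean by the sample mean of the peak counts (the maximum likelihood estimate), which is a substitute for the least squares fit described in the text; the function name is hypothetical.

```python
def spurious_rate_per_deg2(peak_counts, field_side_arcmin=50.0):
    """Upper-limit spurious detection rate per square degree.

    peak_counts: number of accepted 'richness peaks' found in each
    simulated cluster-free field of size field_side_arcmin squared.
    The Poisson mean is taken as the sample mean (an assumption of
    this sketch, not the least squares fit used in the text).
    """
    mean_per_field = sum(peak_counts) / len(peak_counts)
    area_deg2 = (field_side_arcmin / 60.0) ** 2
    return mean_per_field / area_deg2
```

A 50' x 50' field covers about 0.69 deg^2, so a mean of one peak per field corresponds to 1.44 peaks per deg^2.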

To compare these values with those of a traditional
method, we calculate the spurious detection rates of the
count-in-cells technique with a cell size of 2 θ_core for
the 2.5σ and 3σ levels (σ is the standard deviation of the
distribution of the number of galaxies per cell). They are
also shown in Fig. 7 by dashed lines. It is clearly seen
that the use of magnitude information remarkably
suppresses the spurious detection rate, especially at
lower redshift.
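
For comparison, the count-in-cells criterion can be sketched as a simple threshold on the galaxy count per cell. The sketch assumes the cell counts have already been measured on a grid and takes the threshold as the mean plus a multiple of the standard deviation of the counts; the function name is hypothetical.

```python
import math


def count_in_cells_detections(cell_counts, n_sigma):
    """Indices of cells whose galaxy count exceeds mean + n_sigma * sigma,
    where sigma is the standard deviation of the counts per cell."""
    n = len(cell_counts)
    mean = sum(cell_counts) / n
    sigma = math.sqrt(sum((c - mean) ** 2 for c in cell_counts) / n)
    threshold = mean + n_sigma * sigma
    return [k for k, c in enumerate(cell_counts) if c > threshold]
```

Applied to cluster-free random fields with `n_sigma` set to 2.5 or 3, every returned cell is by construction a spurious detection, which is the quantity compared with the matched-filter result in Fig. 7.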

Moreover, the values represented by the solid lines in
Fig. 7 are just upper limits. We can further suppress the
spurious detection rate by examining the shape of the
L_p-z_fil curve. For most spurious clusters, the L_p-z_fil
curves (e.g., Fig. 3a) do not have a single peak and are
sometimes very noisy, so that we can exclude these cluster
candidates as 'junk' from the resulting cluster catalog.
For some of the others, however, the L_p-z_fil curves have
a good-looking peak just like the one seen in Fig. 3a.
These are 'really spurious' clusters, which we cannot
discriminate from real clusters even with the additional
information of galaxy magnitudes.

Let us roughly estimate the numbers of 'really
spurious' clusters with z = 0.16, 0.20, 0.24, and 0.28.
For the case of z = 0.16, we first randomly select 10
spurious cluster candidates in the 'richness images' for
z_fil = 0.16. Then we examine their L_p-z_fil curves to
find ones with good-looking peaks, and count the number of
'really spurious' clusters whose z_est falls within
0.16 ± 0.02 (0.02 is half of the interval of z_fil for
which likelihood values are actually computed). The other
redshifts are treated in the same way. The numbers of
'really spurious' clusters are found to be 3 and 1 for
z_fil = 0.16 and 0.20, respectively. No 'really spurious'
clusters are found for z_fil = 0.24 and 0.28. The actual
spurious detection rate is therefore much lower than the
upper limit: it is about 30% at z = 0.16, 10% at z = 0.20,
and less than 10% at z = 0.24 and 0.28, of the values
shown as the solid lines in Fig. 7.

Here we have examined spurious detections caused only
by simple statistical projection effects. In addition,
overlaps of two or more poor groups, superpositions of
field galaxies on poor groups, and small clumpy portions
in the outskirts of nearby large clusters (see Sect. 4)
also contribute to spurious detections. For these cases,
the L_p-z_fil curves will also be very noisy or have
several peaks or no peak. Such cluster candidates can
easily be excluded from the resulting catalog or flagged
as doubtful. Only spectroscopic observations of the
galaxies of these cluster candidates can reveal what they
in fact are.

Fig. 7. Spurious detection rates per deg^2 as functions of
filter redshift. Solid lines show the results (upper limits) of
the present method, while dashed lines show those of the
count-in-cells technique.

Even among conspicuous galaxy clumps found by simple
glances at galaxy distributions, some eventually turn out
to be spurious. On the other hand, some marginal
concentrations of galaxies are identified as real
clusters. Using projected positions and magnitudes
simultaneously, we often obtain quite different results
from those of intuitive methods which use only the
projected distribution of galaxies. In other words, we can
quite easily identify a number of non-physical clusters
which could never be discriminated without magnitude
information.

3.4. Comparison with the method by P96

This method is a variant of the one by P96. The basic 
idea is identical, but there are two main differences in the 
actual procedures. 

The first is the form of the likelihood function. While
our likelihood function is based on Poisson statistics (Eq.
6), the one employed by P96 (Eq. 15 of their paper) is
based on Gaussian statistics; namely, P96's likelihood is
proportional to $\chi^2$. Of course, that expression is valid, as
they say, when there is a sufficient number of background
galaxies. However, when the limiting magnitude is brighter
and there are fewer galaxies, the adoption of Gaussian
statistics becomes unsuitable. Moreover, the
likelihood function actually employed by P96 (Eq. 16 of
their paper) is an approximation of the formal expression,
though they mention that maximizing the simplified
likelihood is roughly equivalent to maximizing the
formal one. We do not investigate how these differences
affect the accuracy of redshift estimation. However, the
results are clearly different. In our case, the true and
estimated redshifts agree well, and no significant systematic
deviation appears up to at least $z = 0.5$. On
the other hand, the redshift estimated with the method
by P96 tends to be systematically smaller than the true
value (see Fig. 14 of their paper). Considering the different
color band, limiting magnitude, and Hubble constant between
this work and P96, $z = 0.5$ in our case corresponds
to $z \sim 0.7$ and $z \sim 1.0$ for the $V_4$ and $I_4$ bands,
respectively, in their paper. At $z_{\rm true} = 1.0$ in the lower panel
(for the $I_4$ band) of Fig. 14 of P96, the discrepancy
between the true redshift and the mean of the estimated
values is no less than 0.2, while that for our method
is much less than 0.01 at $z = 0.50$, as shown in Fig. 6a.
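The contrast between Poisson and Gaussian statistics can be illustrated
with a minimal sketch. The functions below are generic textbook forms of
a binned log-likelihood, not Eq. 6 of this paper or Eqs. 15-16 of P96,
and the bin counts are made-up illustrative numbers. With about 100
galaxies per bin the two forms nearly coincide, whereas with expected
counts well below one per bin they differ markedly.

```python
import math

def poisson_log_likelihood(counts, expected):
    """Sum over bins of n*ln(mu) - mu - ln(n!)."""
    return sum(n * math.log(mu) - mu - math.lgamma(n + 1)
               for n, mu in zip(counts, expected))

def gaussian_log_likelihood(counts, expected):
    """Gaussian approximation with mean mu and variance mu in every bin."""
    return sum(-0.5 * ((n - mu) ** 2 / mu + math.log(2.0 * math.pi * mu))
               for n, mu in zip(counts, expected))

# Illustrative counts only (assumptions, not data from the paper).
dense_counts, dense_expected = [95, 104, 100, 110], [100.0] * 4
sparse_counts, sparse_expected = [0, 1, 0, 2], [0.3] * 4

dense_gap = abs(poisson_log_likelihood(dense_counts, dense_expected)
                - gaussian_log_likelihood(dense_counts, dense_expected))
sparse_gap = abs(poisson_log_likelihood(sparse_counts, sparse_expected)
                 - gaussian_log_likelihood(sparse_counts, sparse_expected))
print(f"dense bins:  |ln L_Poisson - ln L_Gauss| = {dense_gap:.3f}")
print(f"sparse bins: |ln L_Poisson - ln L_Gauss| = {sparse_gap:.3f}")
```

The sparse case mimics a bright limiting magnitude, where few galaxies
fall in each bin and the Gaussian form is a poor substitute.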

The second difference is the binning procedure. We bin
the galaxies by position and magnitude, while
P96 did not. Binning significantly reduces the
processing time (to about a tenth) in the same
computational environment. This is crucial for constructing
very large cluster catalogs, for which such techniques
show their real worth.
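As a rough illustration of such binning, the sketch below counts
galaxies in cells of position and magnitude. The function name, the
cell size, and the magnitude step are hypothetical choices made for
this example; the actual bin widths used in this work are those given
in the description of the method, not the ones shown here.

```python
from collections import Counter

def bin_galaxies(galaxies, cell_size, mag_step, mag_min):
    """Count galaxies per (x cell, y cell, magnitude bin).

    galaxies: iterable of (x, y, magnitude) tuples, with x and y in the
    same units as cell_size. All bin widths here are illustrative.
    """
    counts = Counter()
    for x, y, mag in galaxies:
        key = (int(x // cell_size),
               int(y // cell_size),
               int((mag - mag_min) // mag_step))
        counts[key] += 1
    return counts

# Three made-up galaxies: the first two share one cell and magnitude bin.
sample = [(0.12, 0.31, 18.3), (0.15, 0.38, 18.4), (0.72, 0.05, 20.1)]
binned = bin_galaxies(sample, cell_size=0.1, mag_step=0.5, mag_min=14.0)
for key, n in sorted(binned.items()):
    print(key, n)
```

Once the catalog is reduced to counts per cell, the likelihood only has
to be evaluated once per occupied cell rather than once per galaxy,
which is where a saving in processing time would come from.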

4. Cluster survey in the NGP region 

In Sect. 3, we examined the performance of this
method with a well-behaved model cluster. Further tests
with real galaxy data are needed before the method is put
to practical use. Here we perform a cluster survey with
5.3 deg$^2$ of B-band data of the North Galactic Pole region,
which were obtained with our Mosaic CCD Camera 1
(hereafter MCCD1) attached to the 1.05 m Schmidt
telescope at Kiso Observatory, Japan.

4.1. Observation 

The observations were made from March 16th to 18th,
1994, at Kiso Observatory. MCCD1, consisting of $2 \times 8$
TC215 CCDs, was attached to the 1.05 m Schmidt telescope.
Each CCD has $1000 \times 1018$ pixels and the pixel size is
12 $\mu$m $\times$ 12 $\mu$m. This corresponds to a scale of
0.75 arcsec/pixel at the prime focus of the Kiso Schmidt telescope
(see Sekiguchi et al. 1992 for more details). In MCCD1, the
CCD chips are placed with large gaps between them.
We therefore have to take 15 exposures to obtain data for
a contiguous region on the sky.
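The quoted pixel size and pixel scale can be cross-checked with the
small-angle relation between the two. The short sketch below derives
the focal length implied by these two numbers and the angular size of
one chip; it uses only the values stated above, and the function name
is chosen for this example.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi  # about 206265

def implied_focal_length_m(pixel_size_um, scale_arcsec_per_pixel):
    """Focal length for which one pixel subtends the given angle."""
    return pixel_size_um * 1e-6 * ARCSEC_PER_RADIAN / scale_arcsec_per_pixel

focal_m = implied_focal_length_m(12.0, 0.75)

# Angular size of one 1000 x 1018 pixel chip, in arcminutes.
chip_arcmin = (1000 * 0.75 / 60.0, 1018 * 0.75 / 60.0)

print(f"implied focal length: {focal_m:.2f} m")
print(f"one chip covers {chip_arcmin[0]:.1f}' x {chip_arcmin[1]:.1f}'")
```

Each chip therefore covers only about an eighth of a degree on a side,
which, together with the gaps between chips, is why several exposures
are needed to fill a contiguous field.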

The data are centered at $(\alpha, \delta) = (13^{\rm h}09^{\rm m}.1, +29^\circ48'.3)$
(J2000.0), covering $1.7 \times 3.4$ deg$^2$ with 15 exposures.
Unfortunately, the seeing was poor, reaching 6.0 arcseconds.
One chip was not working, so data are missing for about
0.3 deg$^2$ in the south-eastern corner; hence the actual
observed area is 5.3 deg$^2$.

4.2. Galaxy catalog construction
4.2.1. Data reduction 

The data reduction is carried out in the usual way for optical
CCD imaging data. After bias subtraction, flat-fielding,
and sky subtraction, we measure relative positions and
relative gains between all pairs of neighboring frames taken
either with the same CCD or with adjacent CCDs at
different exposures, using stars common to both frames,
to construct a mosaicked image. When matching the
images, we distributed the positional and flux errors uniformly
over the whole data set. The typical seeing size is 3.5-4.0
arcseconds, but among the 15 exposures there are some data
with large seeing (~ 6.0 arcseconds). To detect objects
homogeneously over the whole combined data with
the same threshold, we convolved all frames with a
two-dimensional Gaussian of appropriate $\sigma$ so that the FWHM
of the PSF at any place in the mosaicked image becomes the
same (namely, the largest value of 6.0 arcseconds). The
detection threshold is set to 25.5 mag arcsec$^{-2}$ in the B
band. This corresponds to 1.5-3 $\sigma_{\rm sky}$ above the sky level.
If more than 10 connected pixels have counts exceeding the
threshold, we regard them as an object.
Altogether 6822 objects were detected. They consist of stars,
galaxies, sky noise, and junk. All the above procedures
are performed almost automatically by the data reduction
software system developed by our group (Doi et al. 1995).
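Two of the steps above lend themselves to a short sketch: choosing the
width of the Gaussian that degrades a frame to the common 6.0 arcsecond
FWHM, and grouping connected pixels above the threshold into objects.
This is not the reduction software of Doi et al. (1995). It assumes
Gaussian PSFs (so that widths add in quadrature) and 4-connectivity
between pixels, neither of which is stated in the text, and the
function names are invented for this example.

```python
import math
from collections import deque

FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.3548

def matching_kernel_sigma(fwhm_in, fwhm_target):
    """Sigma of the Gaussian kernel that broadens fwhm_in to fwhm_target."""
    if fwhm_in > fwhm_target:
        raise ValueError("input PSF is already broader than the target")
    return math.sqrt(fwhm_target ** 2 - fwhm_in ** 2) / FWHM_PER_SIGMA

def detect_objects(image, threshold, min_pixels=11):
    """Group 4-connected pixels above threshold; keep groups that have
    at least min_pixels members (i.e. more than 10 pixels)."""
    n_rows, n_cols = len(image), len(image[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    objects = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                i, j = queue.popleft()
                pixels.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if (0 <= a < n_rows and 0 <= b < n_cols
                            and not seen[a][b] and image[a][b] > threshold):
                        seen[a][b] = True
                        queue.append((a, b))
            if len(pixels) >= min_pixels:
                objects.append(pixels)
    return objects

# Kernel needed to bring a 3.5 arcsec frame to the 6.0 arcsec target.
kernel_sigma = matching_kernel_sigma(3.5, 6.0)

# Toy image: one 3 x 4 block (12 pixels) and one isolated bright pixel.
toy = [[0.0] * 8 for _ in range(8)]
for r in range(2, 5):
    for c in range(1, 5):
        toy[r][c] = 5.0
toy[7][7] = 9.0
found = detect_objects(toy, threshold=1.0)
print(f"kernel sigma = {kernel_sigma:.2f} arcsec, objects found = {len(found)}")
```

In the toy image only the 12-pixel block survives the size cut; the
isolated bright pixel is rejected, as a cosmic-ray-like spike would be.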

The error in astrometry is 0.9 arcsecond rms (with
$2\sigma$ rejection), the magnitude zero-point error is 0.02 mag
rms, and the random error in magnitude is 0.2 mag
rms (Akiyama 1996). The large random error in magnitude
is due to bad weather conditions, namely large fluctuations
of the seeing size. However, as described in the previous
section, these errors are within tolerance for our
cluster-finding technique.

4.2.2. Star/galaxy discrimination 

The detected objects consist of stars, galaxies, sky noise,
and junk. We extract galaxies from the objects using
photometric information, namely the 'sharpness' of the
image and the magnitude.

Fig. 8 shows the distribution of all objects in the
'sharpness'-magnitude diagram. 'Sharpness' is defined by
$I_{\rm peak}/\sqrt{N_{\rm pix}}$, where $I_{\rm peak}$ is the peak count of an
object and $N_{\rm pix}$ is the number of pixels belonging to
the object. In Fig. 8, we can recognize a tight sequence,
which corresponds to stars. The bending of the sequence
at the bright end reflects the saturation of the CCDs.
Galaxies, having flatter profiles than stars of the same
magnitude, are widely distributed in the region below the star
sequence. Most of the sky noise occupies the faint and unsharp
end of the diagram (often in several short sequences) and
we can eliminate it by simply setting a cut-off magnitude.
Other, brighter objects that are unquestionably junk, with
quite flat profiles caused by bad pixel columns, haloes around
bright stars, and sometimes trails of artificial satellites, are
often mixed in with galaxies. These must be carefully checked
and removed.

We separate galaxies from the other objects with the 
following boundaries. The first one is a line corresponding 



Fig. 8. Star/galaxy discrimination diagram for the 6822
'objects' in the B band MCCD data in the NGP region. The
border between 'star region' and 'galaxy region' is shown as a
solid line.

to Gaussian profiles with $\sigma = 1.1\,\sigma_{\rm PSF}$, where $\sigma_{\rm PSF}$ is the
standard deviation of the best-fit Gaussian to the PSF
that was composed of stellar images. PSFs usually have
longer tails than a Gaussian and are never well fitted
by a single Gaussian profile. However, for fainter
magnitudes, the outskirts of PSFs become negligible, and an
approximation with a single Gaussian is good enough. The
second boundary is $I_{\rm peak}/\sqrt{N_{\rm pix}} = 300$. Actually, some
objects above this boundary and below the star sequence
are blended objects; in most cases, they are galaxies
overlapping with stars. Toward fainter magnitudes, the star
sequence falls and eventually merges into the galaxy
territory, and so does the sky noise. It is then no longer
possible to discriminate between stars and galaxies. Therefore,
our galaxy sample must also be restricted by the limit of
star/galaxy discrimination. We fix the limit to be $m_{\rm limit}$
= 21.0, which is the third boundary.
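The three boundaries can be combined into a simple classifier, sketched
below under several assumptions that go beyond the text. The Gaussian
boundary is derived for an ideal circular Gaussian image with total
counts $F$ and width $\sigma$, for which $I_{\rm peak} = F/(2\pi\sigma^2)$
and the number of pixels above a threshold $T$ is
$N_{\rm pix} = 2\pi\sigma^2 \ln(I_{\rm peak}/T)$. The photometric zero point,
the threshold in counts, the PSF width in pixels, and the reading that
galaxies lie below the sharpness value of 300 are all hypothetical
choices made for this example.

```python
import math

def sharpness(peak_count, n_pix):
    """Sharpness as defined in the text: I_peak / sqrt(N_pix)."""
    return peak_count / math.sqrt(n_pix)

def gaussian_boundary(mag, sigma_pix, threshold, zero_point):
    """Sharpness of an ideal circular Gaussian image of this magnitude.

    Returns None if its peak would not exceed the detection threshold.
    zero_point is a hypothetical calibration constant.
    """
    flux = 10.0 ** (-0.4 * (mag - zero_point))   # total counts
    area = 2.0 * math.pi * sigma_pix ** 2
    peak = flux / area
    if peak <= threshold:
        return None
    n_pix = area * math.log(peak / threshold)
    return peak / math.sqrt(n_pix)

def is_galaxy(mag, peak_count, n_pix, sigma_psf_pix, threshold, zero_point,
              max_sharpness=300.0, mag_limit=21.0):
    """Apply the three boundaries: magnitude limit, sharpness cut, and
    the Gaussian line with sigma = 1.1 * sigma_PSF."""
    if mag > mag_limit:
        return False
    s = sharpness(peak_count, n_pix)
    if s >= max_sharpness:   # assumed direction of the second boundary
        return False
    boundary = gaussian_boundary(mag, 1.1 * sigma_psf_pix,
                                 threshold, zero_point)
    return boundary is not None and s < boundary

# Made-up objects at B = 18 with sigma_PSF = 3.4 pixels, threshold = 10
# counts, and zero point = 30 (all illustrative values).
extended = is_galaxy(18.0, 300.0, 600, 3.4, 10.0, 30.0)
stellar = is_galaxy(18.0, 868.7, 324, 3.4, 10.0, 30.0)
too_faint = is_galaxy(21.5, 50.0, 40, 3.4, 10.0, 30.0)
print(extended, stellar, too_faint)
```

The first object is flatter than the Gaussian line and is kept as a
galaxy, the second sits on the stellar sequence, and the third is
fainter than the discrimination limit.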

Finally, we cut off the uneven edge region caused by
the dead CCD chip and select the central $1.7 \times 2.9$ deg$^2$
rectangular region, which contains 996 galaxies. The
two-dimensional distribution of these galaxies is shown in
Fig. 9. North is up and east is to the left.

4.3. Results 

Fig. 10 shows the 'richness image' of the NGP region for
$z_{\rm fil} = 0.20$. We can find some cluster candidates as peaks.
Table 1 lists 18 significant peaks with $N_p > 180$ at $z_{\rm fil}$ =
0.20, and their $\mathcal{L}_p$-$z_{\rm fil}$ curves are shown in Fig. 11. In
Fig. 11, only No. 3 and 7 have a prominent single peak in
their $\mathcal{L}_p$-$z_{\rm fil}$ curves. The $\mathcal{L}_p$-$z_{\rm fil}$ curves for the other
cluster candidates are almost flat and featureless (No. 4,
5, 8, 13, and 18) or descend monotonically as the filter
redshift increases (the others).
Fig. 9. The distribution of 996 galaxies in the $1.7 \times 2.9$
deg$^2$ region. Symbol size changes with apparent magnitude.
The largest symbol corresponds to $m_B \sim 14.2$ and the
smallest to $m_B = 21.0$. The field center is at $(\alpha, \delta)$ =
$(13^{\rm h}08^{\rm m}55^{\rm s}.3, +30^\circ00'47''.2)$ (J2000.0). North is up and east
is to the left.

An $\mathcal{L}_p$-$z_{\rm fil}$ curve which descends monotonically with
increasing filter redshift does not always mean that the
redshift of the corresponding cluster candidate is less than
0.1; in most cases, such $\mathcal{L}_p$-$z_{\rm fil}$ curves just keep increasing
and have no peak, or become noisy, as the filter redshift
becomes even smaller. The behavior is similar in the
case of a monotonically ascending $\mathcal{L}_p$-$z_{\rm fil}$ curve. Thus, most
of the cluster candidates with flat or monotonically
descending/ascending $\mathcal{L}_p$-$z_{\rm fil}$ curves are spurious.
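The qualitative sorting of curves described above (single peak, flat,
monotonic, or noisy) could be automated along the following lines. This
is only a sketch: the function name and the single `min_prominence`
parameter are hypothetical, and the paper does not state a quantitative
criterion for what counts as a prominent peak.

```python
def classify_curve(values, min_prominence):
    """Classify peak log-likelihoods sampled at increasing filter redshift.

    Returns one of 'flat', 'descending', 'ascending', 'single-peak',
    or 'noisy'. min_prominence is an illustrative tuning parameter.
    """
    lowest, highest = min(values), max(values)
    if highest - lowest < min_prominence:
        return "flat"
    steps = [b - a for a, b in zip(values, values[1:])]
    if all(s <= 0 for s in steps):
        return "descending"
    if all(s >= 0 for s in steps):
        return "ascending"
    k = values.index(highest)
    rises = all(s >= 0 for s in steps[:k])
    falls = all(s <= 0 for s in steps[k:])
    prominent = highest - max(values[0], values[-1]) >= min_prominence
    if rises and falls and prominent:
        return "single-peak"
    return "noisy"

# Made-up curves for illustration.
print(classify_curve([1, 3, 6, 4, 2], min_prominence=2))        # single-peak
print(classify_curve([5, 4, 3, 2, 1], min_prominence=2))        # descending
print(classify_curve([3.0, 3.1, 3.0, 3.1], min_prominence=2))   # flat
print(classify_curve([1, 5, 2, 6, 1], min_prominence=2))        # noisy
```

Only candidates classified as single-peak would be kept as likely real
clusters; the rest would be flagged as doubtful rather than silently
discarded.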

We can recognize many cluster candidates gathered
in the bottom-right region of Fig. 10. However, this area
includes the north-eastern outskirts of the Coma cluster,



Fig. 10. 'Richness image' of the NGP galaxy sample of Fig.
9 for $z_{\rm fil} = 0.20$. Solid line shows the contour of the threshold
richness $N_{\rm th} = 180$. Plus signs indicate positions of cluster
candidates.

which correspond to the concentration of bright galaxies
in the bottom-right region of Fig. 9. No cluster candidate
in this region has a single peak in its $\mathcal{L}_p$-$z_{\rm fil}$ curve,
implying that most of them may be spurious. The $\mathcal{L}_p$-$z_{\rm fil}$
curves of the candidates No. 16-18 do not show a single
peak and their $N_p$ values are too small (less than 100).
They may also be spurious.

The most significant cluster candidates are No. 3 and 7.
These candidates both correspond to a single Abell cluster,
A1677. The splitting into two peaks may be due either to
the poor quality of the data (for example, the bright limiting
magnitude or the inhomogeneity) or to possible
substructure. The measured redshift of A1677 is 0.183
(ACO) and the Abell richness $c$ is 112, which corresponds
to $N \sim 3000$. Our estimates of $(z, N)$ are (0.26, 697.1) for
No. 3 and (0.16, 673.5) for No. 7. There is another cataloged
cluster, No. 14. This cluster is II Zw 1305.4+2941 (Koo et
al. 1986 and references therein). It has also been detected
with X-ray satellites such as Einstein (MS 1305.4+2941 in
Gioia et al. 1990), ROSAT (1RXS J130749.3+292536 in
Voges et al. 1996), and ASCA (Ueda 1996). The measured
redshift of this cluster is 0.241, while our redshift
estimate for this cluster is 0.10. Taking into account the
poor quality of the data and the limiting magnitude, which is
brighter than that of the simulations in Sect. 3, we conclude
that the redshift and richness estimates for these two
clusters are consistent with the cataloged values.

Let us compare this result with that of intuitive selection
by eye. A glance at the galaxy distribution in Fig. 9
reveals some other 'somewhat conspicuous' galaxy
concentrations, for example at (X,Y) = (35,110), (40,90),
and (60,80). At a glance, these three clumps seem to be more
plausible 'clusters' than fainter ones such as
No. 14 in Table 1 at (X,Y) = (55,55). However, when we
examine the 'richness image' (Fig. 10), these three appear
to have much less remarkable peaks than No. 14, which is
a real cluster.

Searching for clusters with a simultaneous use of
magnitudes and positions can produce a quite different, and
more objective, result than conventional techniques
that use the surface density of galaxies only.

5. Conclusion 

We have developed an objective and automatic cluster-finding
method. It is a variant of the P96 method with some
improvements. The method uses positions and apparent
magnitudes of galaxies simultaneously, and detects clusters
by fitting artificial cluster models, which contain redshift
and richness as free parameters, through maximization of the
likelihood function. Redshift and richness of clusters
are therefore estimated as byproducts of detection. Good
accuracy in the estimates of a cluster's position, redshift,
and richness is confirmed by a number of Monte Carlo
simulations. For clusters at $z = 0.20$ that are as rich as the
Coma cluster, the errors in estimating redshift and richness
are $\Delta z \sim 0.02$ and $\Delta N \sim 360$ (12%), respectively. The spurious
detection rate of this method is also studied with Monte
Carlo simulations and is shown to be less than ~10% of
that of conventional techniques using only the surface
density of galaxies. A cluster survey in the NGP region is
performed as a test with real data. Despite the poor quality
of the data, two known real clusters are successfully
detected.

At present, it is quite difficult to make a deep X-ray
survey over a wide area of the sky and to build a large
catalog of clusters, especially distant ($z > 0.3$) ones, though
some attempts are being made, such as SHARC (Collins
et al. 1997; Burke et al. 1997), RDCS (Rosati et al. 1998),
RIXOS (Castander et al. 1995), and WARPS (Scharf et



Table 1. All detected cluster candidates in the NGP region.

No   $\alpha$(2000.0)   $\delta$(2000.0)   $z$      $N$       Notes
 1   13 05 39.1         30 51 36.5         0.10     157.7     †/‡
 2   13 05 43.3         28 58 33.0         <0.1?    <127.8    †/‡
 3   13 05 47.5         31 14 38.1         0.26     697.1     A1677
 4   13 05 48.7         28 47 33.5         <0.1?    <84.7     †/‡
 5   13 05 49.2         28 37 03.5         <0.1?    <386.1    †/‡
 6   13 05 59.4         29 08 04.8         0.11     175.9     ...
 7   13 06 03.6         31 20 40.0         0.16     673.5     A1677
...  ...                ...                ...      ...       †/‡
...  13 06 13.5         28 56 36.0         ...      ...       ...
...  13 06 25.1         28 58 40.3         ...      124.5     †/‡
13   13 07 17.3         29 07 11.2         0.14     116.0     †/‡
14   13 08 25.7         29 26 45.0         0.10     94.6      *
15   13 09 18.7         28 40 45.2         <0.1?    <120.8    †/‡
16   13 10 02.5         30 55 19.7         <0.1?    <100.3    †/‡
17   13 10 07.2         31 06 20.0         <0.1?    <88.6     †/‡
18   13 10 55.9         30 37 47.6         <0.1?    <78.6     †/‡

† - may be junk (edge of Coma)
‡ - may be junk
* - II Zw 1305.4+2941






Fig. 11. Peak logarithmic likelihood as a function of filter
redshift for the 18 cluster candidates shown in Fig. 10.
al. 1997). Thus optical searches are almost the only realistic
way to find a large number of distant clusters. Objective
and automated methods for finding clusters from optical
data will be indispensable tools in the near future for
quickly constructing large, statistically complete cluster
catalogs from data covering an extremely wide area,
for example, those from SDSS (Gunn and Weinberg 1995;
Okamura 1995), or for compiling catalogs of extremely
distant clusters up to $z > 1$ from deep imaging data with
8-10 m class telescopes.
A. Relationship between N and Abell richness

We adopt $N$, the field-corrected number of all galaxies
brighter than $m^*+5$, as the indicator of cluster richness
in this paper. In most previous studies, however, Abell's
richness parameter $c$, which is the field-corrected number
of galaxies inside the Abell radius with magnitudes between
$m_3$ and $m_3 + 2$, is conventionally used. For convenience,
we estimate the relationship between $N$ and $c$ by simply
using random values whose probability density function
obeys a Schechter function.

For a given value of $N$, we generate 20 'clusters' and
count the numbers of galaxies with magnitudes between $m_3$
and $m_3 + 2$.
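The procedure can be sketched as follows. This is an illustrative
Monte Carlo, not the code used for Fig. 12: the faint-end slope
`alpha = -1.25`, the bright cut-off at $m^* - 4$, the magnitude grid,
and the random seed are all assumptions of this example, and no field
correction or Abell-radius cut is applied.

```python
import bisect
import math
import random

def schechter_weight(x, alpha):
    """Unnormalised Schechter density in magnitude offset x = m - m*."""
    return 10.0 ** (-0.4 * (alpha + 1.0) * x) * math.exp(-(10.0 ** (-0.4 * x)))

def build_cdf(alpha, x_min=-4.0, x_max=5.0, n_grid=2000):
    """Tabulate the cumulative distribution between m*-4 and m*+5."""
    step = (x_max - x_min) / n_grid
    xs = [x_min + (i + 0.5) * step for i in range(n_grid)]
    cumulative, total = [], 0.0
    for x in xs:
        total += schechter_weight(x, alpha)
        cumulative.append(total)
    return xs, [c / total for c in cumulative]

def draw_cluster(n_gal, xs, cdf, rng):
    """Draw n_gal magnitude offsets and return them sorted, brightest first."""
    mags = []
    for _ in range(n_gal):
        i = bisect.bisect_left(cdf, rng.random())
        mags.append(xs[min(i, len(xs) - 1)])
    return sorted(mags)

def abell_like_count(sorted_mags):
    """Number of galaxies between m_3 and m_3 + 2."""
    m3 = sorted_mags[2]
    return sum(1 for m in sorted_mags if m3 <= m <= m3 + 2.0)

def median_count(n_gal, alpha=-1.25, n_clusters=20, seed=0):
    """Median Abell-like count over n_clusters artificial clusters."""
    rng = random.Random(seed)
    xs, cdf = build_cdf(alpha)
    counts = sorted(abell_like_count(draw_cluster(n_gal, xs, cdf, rng))
                    for _ in range(n_clusters))
    mid = n_clusters // 2
    if n_clusters % 2 == 0:
        return 0.5 * (counts[mid - 1] + counts[mid])
    return counts[mid]

for n_gal in (300, 1000, 3000):
    print(n_gal, median_count(n_gal))
```

The counts printed by this sketch depend on the assumed slope and
cut-offs, so they should not be compared directly with the values
plotted in Fig. 12.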

Fig. 12 shows the relationship. The median of the 20 $c$
values corresponding to a given $N$ is shown by the solid line. The
region between the two dashed lines represents the central
$\pm 1\sigma$ range. The median value of $c$ is well approximated by
the following power-law relations:

$c = 0.46\,N^{0.??} \quad (N \lesssim 4000)$   (A1)
$c = 0.33\,N^{0.??} \quad (N \gtrsim 4000)$   (A2)

For the Coma cluster c equals 106, which corresponds to 
N ~ 3000. 

The magnitude of the third brightest galaxy is often
significantly affected by overlapping galaxies or by
variations in cluster morphology; if a cluster's morphology
is cD or B in the Rood-Sastry classification (Rood and
Sastry 1971), $m_3$ would often be fainter because the original
third brightest galaxy may have merged into the first
or second brightest galaxy, so that a fainter galaxy becomes
the new third brightest galaxy. On the other hand, $m_3$
would be brighter for L or C clusters because the third
brightest galaxy may capture smaller ones. The error in
estimating $m_3$ directly affects $c$ and eventually leads to
a serious systematic error in the evaluation of cluster
richness. In contrast, $N$ includes all galaxies
brighter than $m^*+5$ and is less affected by those factors.
We can therefore use $N$ as a more robust richness indicator than $c$.



Fig. 12. Relationship between our richness parameter $N$ and
Abell richness $c$. Solid line shows the median value of $c$ of 20
artificial clusters with a fixed $N$. Dashed lines show the $\pm 1\sigma$ range.